Five factor analysis procedures for dichotomous items are discussed. 
A simulation study was conducted to compare the various methods. The 
item parameters of four different sets of items were used with 
numbers of subjects set at 250, 500, and 1,000. Ten replications were 
generated for each set of item parameters and each sample size. All 
models were compared with respect to estimates of IRT and factor 
analysis parameters using six criteria in terms of mean squared 
differences between the known and estimated item parameters. The most 
striking result of the simulation study was that common factor 
analysis programs outperformed the more complex programs TESTFACT, 
MAXLOG, and NOHARM. It was apparent that a common factor analysis on 
the matrix of tetrachoric correlations yielded the best estimates. A 
procedure based on the mean squared residuals of the correlation 
matrix was also presented for assessing the dimensionality of the 
model. Nine tables present the data from the simulation study. 

Abstract 

Many multidimensional item response models have been 
proposed in the literature. The models and various methods for 
estimating the item parameters are reviewed briefly. In a 
simulation study these methods are compared with respect to 
their estimates of the item parameters. It is concluded that 
a common factor analysis on the matrix of tetrachoric 
correlations yields the best estimates. 

Additionally, a procedure based upon the mean squared 
residuals of the correlation matrix is presented for the 
assessment of the dimensionality of the model. 

Key words: Common Factor Analysis, Dichotomous Variables, 
Item Response Theory, Multidimensionality, 
Tetrachoric Correlations. 

Empirical Comparison between Factor 
Analysis and Item Response Models 



Introduction 

One of the main problems in constructing Rasch (1960) 
scales from a large number of dichotomously scored items is 
the multidimensionality of the item pool. Usually, the Rasch 
model does not fit the whole item pool. Procedures for 
constructing Rasch scales which start from the entire 
item pool are not very promising (cf. Knol, 1987b). A more 
promising procedure is to identify the main dimensions of the 
item pool and to start an (iterative) procedure on the 
different subsets of items separately. Verhelst (1983) and 
Knol (1987b) describe such iterative procedures. 

To identify the main dimensions of an item pool, a 
multidimensional representation of the items can be useful. 
Many multidimensional models can be used for that purpose. 
Roughly, the models can be distinguished by the extent to 
which they make use of the information in the data matrix. 
For continuous variables, a classical common factor analysis 
(FA) on the matrix of product-moment correlations can be 
used. However, for dichotomous items, the matrix of pairwise 
(tetrachoric) correlations is not sufficient (Mood, Graybill 
& Boes, 1974, pp. 299-314). Therefore, several models have 
been proposed which do use all the available information 
contained in the response vectors. Because these so-called 
full-information (cf. Bock, Gibbons & Muraki, 1985) models 
suffer from numerical difficulties, an approximation has been 
proposed by McDonald (1985). 

The purpose of the present paper is to compare the so-called 
full-information models with the models which use only 
pairwise information. For simulated data, various estimates 
of the item parameters will be compared. Furthermore, a 
procedure to estimate the dimensionality of the models will 
be presented. 

Firstly, a short review of multidimensional item 
response theory (IRT) models will be given. Then the various 
FA models for dichotomous variables are described. 

Multidimensional IRT Models 

Several multidimensional IRT models for dichotomous data 
have been proposed (Bock & Aitkin, 1981; Bock & Lieberman, 
1970; McDonald, 1985; Mulaik, 1972; Rasch, 1961; Reckase, 
1973; Sympson, 1978; Whiteley, 1980). Generally, the models 
can be classified into so-called compensatory models, which 
allow high ability values on one dimension to compensate for 
low abilities on other dimensions, and noncompensatory 
models. The latter models (Sympson, 1978; Whiteley, 
1980) do not allow high ability on one dimension to compensate 
for low ability on other dimensions. Apart from the psychological 
meaningfulness of these models, the most important practical 
disadvantage of noncompensatory models is that no efficient 
algorithms for the parameter estimates are available. 

The compensatory model of Bock and Aitkin (1981) is 
relatively simple, and a marginal maximum likelihood (MML) 
procedure for the estimation of item parameters has been 
developed. Let $\mathbf{X} = (X_1, \ldots, X_n)'$ be a random vector of 
responses to n dichotomous items, where each $X_i$ 
(i = 1, ..., n) is defined as 

(1)  $X_i = \begin{cases} 1, & \text{if item } i \text{ is correctly answered} \\ 0, & \text{otherwise.} \end{cases}$ 



Under the (usual) assumption of local independence, the 
marginal probability of the response vector $\mathbf{X} = \mathbf{x}$ is given by 

(2)  $P(\mathbf{X} = \mathbf{x}) = \int \prod_{i=1}^{n} [p_i(\boldsymbol{\theta})]^{x_i} [1 - p_i(\boldsymbol{\theta})]^{1 - x_i} \, g(\boldsymbol{\theta}) \, d\boldsymbol{\theta}$ , 

where $p_i(\boldsymbol{\theta})$ is the item characteristic function (ICF) of item 
i, $g(\boldsymbol{\theta})$ is the density function of the unobserved m-component 
random vector of abilities $\boldsymbol{\theta}$, and the integration is taken 
over the entire multidimensional ability space. It is assumed 
that $\boldsymbol{\theta}$ is multivariate normally (MVN) distributed with mean 
$\mathbf{0}$ and covariance matrix $I$. In the multidimensional two-parameter 
normal ogive (M2PNO) model the ICF of item i is 
given by 



(3)  $p_i(\boldsymbol{\theta}) = F(\mathbf{a}_i'\boldsymbol{\theta} - \beta_i)$ , 

where $\mathbf{a}_i$ is the m x 1 vector of discrimination parameters for 
item i, $\beta_i$ is the difficulty parameter for item i (i = 1, ..., n), 
and $F(\cdot)$ is the cumulative standard normal distribution. An 
iterative procedure to obtain MML estimates of the item 
parameters via the EM algorithm (Dempster, Laird & Rubin, 
1977) has been implemented in the computer program TESTFACT 
(Wilson, Wood & Gibbons, 1984). 

An IRT model that uses only the information contained in the 
pairwise proportions is based upon McDonald's (1985) harmonic 
analysis. A computer program, NOHARM II (Fraser, 1988), is 
available in which the pairwise proportions 
$\pi_{ij} = P(X_i = 1, X_j = 1)$ are approximated by minimizing the 
unweighted least squares function 

(4)  $f(A, \boldsymbol{\beta}) = \sum \sum [p_{ij} - \pi_{ij}(\mathbf{a}_i, \beta_i, \mathbf{a}_j, \beta_j)]^2$ , 

where $A = (\mathbf{a}_1, \ldots, \mathbf{a}_n)'$, $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_n)'$, the $p_{ij}$ are the sample 
proportions, and the ICFs are approximated by a third-degree 
Hermite-Tchebycheff polynomial. 

Because of the well-known relationship between the 
logistic distribution function L (cf. Mood, Graybill & Boes, 
1974, p. 542) and the cumulative standard normal distribution 
function F, 

(5)  $|F(z) - L(1.7z)| < 0.01$ 

for all z (Haley, 1952), it is possible to approximate the 
normal ogive ICF (3) by the logistic ICF 

(6)  $p_i(\boldsymbol{\theta}) = \dfrac{\exp[c(\mathbf{a}_i'\boldsymbol{\theta} - \beta_i)]}{1 + \exp[c(\mathbf{a}_i'\boldsymbol{\theta} - \beta_i)]} = L[c(\mathbf{a}_i'\boldsymbol{\theta} - \beta_i)]$ . 
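
The bound in (5) can be checked numerically. The following is a minimal sketch, not part of the original study: the function names and the evaluation grid are illustrative assumptions, and only the Python standard library is used.

```python
import math


def normal_cdf(z: float) -> float:
    """Cumulative standard normal distribution F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def logistic_cdf(z: float) -> float:
    """Logistic distribution function L(z)."""
    return 1.0 / (1.0 + math.exp(-z))


def max_abs_gap(scale: float = 1.7, lo: float = -8.0, hi: float = 8.0,
                steps: int = 16001) -> float:
    """Largest |F(z) - L(scale * z)| on an evenly spaced grid (grid is an assumption)."""
    width = (hi - lo) / (steps - 1)
    grid = (lo + k * width for k in range(steps))
    return max(abs(normal_cdf(z) - logistic_cdf(scale * z)) for z in grid)


if __name__ == "__main__":
    # With the scaling constant 1.7 the gap stays below the bound in (5).
    print(max_abs_gap(1.7) < 0.01)
```

Without the scaling constant (scale = 1) the two curves differ by far more than 0.01, which is why the logistic ICF in (6) carries the constant c.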

The computer program MAXLOG (McKinley & Reckase, 1983) yields 
estimates of the parameters of the multidimensional two-parameter 
logistic (M2PL) model. Because the program uses the 
method of joint ML estimation, problems such as the so-called 
drift of the discrimination parameters may be encountered, and 
estimation may be cumbersome when the number of subjects N is 
large. 

In all three programs mentioned above the numbers of 
variables and dimensions are limited. This makes the programs 
not very useful for large-scale applications. 



FA for Dichotomous Items 

In FA for dichotomous variables (Christoffersson, 1975; 
Muthén, 1978), the response variables $X_i$ are assumed to be 
governed by the unobserved continuous variables $Y_i$ and 
thresholds $\tau_i$ as 

(7)  $X_i = \begin{cases} 1, & \text{if } Y_i > \tau_i \\ 0, & \text{otherwise,} \end{cases}$ 

where 

(8)  $\mathbf{Y} = \Lambda\boldsymbol{\theta} + \mathbf{E}$ , 

and $\mathbf{Y} = (Y_1, \ldots, Y_n)'$. Model (8) is the usual random factors 
FA model, the only difference being that $\mathbf{Y}$ is unobserved. 
Under the assumptions 

(9a)  $\boldsymbol{\theta} \sim \mathrm{MVN}(\mathbf{0}, I)$ , 

(9b)  $\mathbf{E} \sim \mathrm{MVN}(\mathbf{0}, \Psi^2)$ , 

where $\Psi^2$ is a diagonal matrix with positive diagonal 
elements, and 

(9c)  $\mathrm{cov}(\boldsymbol{\theta}, \mathbf{E}) = 0$ , 

the covariance matrix $\Sigma$ among the Y variables is given by 

(10)  $\Sigma = \Lambda\Lambda' + \Psi^2$ . 

Hence, 

(11)  $\mathbf{Y} \sim \mathrm{MVN}(\mathbf{0}, \Lambda\Lambda' + \Psi^2)$ . 

In FA for dichotomous variables the marginal probability of 
response pattern $\mathbf{X} = \mathbf{x}$ is 

(12)  $P(\mathbf{X} = \mathbf{x}) = \int_{Z} h(\mathbf{y}) \, d\mathbf{y}$ , 

where $h(\cdot)$ is the MVN($\mathbf{0}$, $\Lambda\Lambda' + \Psi^2$) density and Z is the 
multidimensional integration region defined by the Cartesian 
product of the $Z_i$, such that $Z_i = (\tau_i, \infty)$ if $x_i = 1$ and 
$Z_i = (-\infty, \tau_i)$ if $x_i = 0$. 

Takane and De Leeuw (1987) showed the formal equivalence 
of the marginal likelihood (2) of the M2PNO model with 
$\boldsymbol{\theta} \sim \mathrm{MVN}(\mathbf{0}, I)$ and the likelihood (12) of FA for dichotomous 
variables. The parameters of the IRT formulation, $\mathbf{a}_i$ and $\beta_i$ 
(i = 1, ..., n), can be expressed in terms of the parameters of 
the FA formulation, $\boldsymbol{\lambda}_i$, $\tau_i$ and $\psi_i$, as 

(13a)  $\mathbf{a}_i = \boldsymbol{\lambda}_i / \psi_i$ 

and 

(13b)  $\beta_i = \tau_i / \psi_i$ 

(Takane & De Leeuw, 1987), where $\boldsymbol{\lambda}_i'$ denotes row i of $\Lambda$ and 
$\psi_i^2$ is the i-th diagonal element of $\Psi^2$. Conversely, the FA 
parameters can be expressed in terms of the IRT parameters as 

(14a)  $\boldsymbol{\lambda}_i = (1 + \mathbf{a}_i'\mathbf{a}_i)^{-1/2} \, \mathbf{a}_i$ , 

(14b)  $\tau_i = (1 + \mathbf{a}_i'\mathbf{a}_i)^{-1/2} \, \beta_i$ 

and 

(14c)  $\psi_i = (1 + \mathbf{a}_i'\mathbf{a}_i)^{-1/2}$ ; 

cf. also Knol (1987a). 
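
The two sets of transformations are mutual inverses, which can be illustrated for a single item. This is a minimal sketch: the function names `fa_to_irt` and `irt_to_fa` are hypothetical and do not come from any of the programs discussed here.

```python
import math


def fa_to_irt(lam, tau, psi):
    """(13a)-(13b): a_i = lambda_i / psi_i and beta_i = tau_i / psi_i for one item."""
    return [l / psi for l in lam], tau / psi


def irt_to_fa(a, beta):
    """(14a)-(14c): rescale by (1 + a_i'a_i)^(-1/2) to get loadings, threshold, psi_i."""
    scale = 1.0 / math.sqrt(1.0 + sum(x * x for x in a))
    return [scale * x for x in a], scale * beta, scale


if __name__ == "__main__":
    # One item with discriminations (1, 1) and difficulty -2.
    lam, tau, psi = irt_to_fa([1.0, 1.0], -2.0)
    print(lam, tau, psi)
    # Applying (13) to the result of (14) returns the IRT parameters.
    print(fa_to_irt(lam, tau, psi))
```

Under (14) the communality and the unique variance of each item add up to one, so the loadings are on the scale of a correlation matrix.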

The parameters of the FA formulation can be estimated 
with the program LISCOMP (Muthén, 1985), using the method of 
generalized least squares (GLS). Since LISCOMP is not yet 
available for a VAX computer, the FA model for dichotomous 
items will not be treated further in this paper. 

Common to the models treated above is the usage of all 
available information from the data matrix. If we are willing 
to use only the information in the one-way marginals 
(percents correct) and the two-way marginals $p_{ij}$, it is 
possible to approximate the above models by more classical 
models, e.g. models in the realm of common FA. 

If the latent continuous response variables $\mathbf{Y}$ 
underlying the manifest dichotomous response variables $\mathbf{X}$ are 
MVN distributed, then the ML estimator of the product-moment 
correlation between the (bivariate normally distributed) 
variables $Y_i$ and $Y_j$ is given by the tetrachoric correlation 
between $X_i$ and $X_j$. Hence it seems reasonable to perform a 
common FA on the matrix of estimated tetrachoric correlations 
in order to obtain estimates for the FA parametrization of 
model (3). Estimates of the IRT parametrization of the M2PNO 
model can be obtained by the transformations (13). There are, 
however, some problems connected with this approach. As 
already noted, the matrix of sample tetrachoric correlations 
is not sufficient, and the estimates become unstable when the 
proportions of the 2x2 table are extreme or the number of 
observations is low. Furthermore, the matrix of tetrachoric 
correlations is not necessarily positive definite, and this 
makes the matrix inappropriate for the ML and GLS FA methods. 
Finally, one or more unique variances approximately equal to 
zero, i.e. Heywood (1931) cases, may be encountered. See 
Mislevy (1986) for an excellent review of these problems. 

Various FA programs are available. In SPSS-X (1986), 
iterative principal FA (Harman & Jones, 1966), minimum 
residuals or unweighted least squares FA (Harman & Jones, 
1966), generalized least squares FA (Jöreskog & Goldberger, 
1972), maximum likelihood FA (Jöreskog, 1967), and alpha FA 
(Kaiser & Caffrey, 1965) are implemented. These methods will 
be denoted by IPFA, ULS, GLS, ML and ALPHA, respectively. In 
LISREL VI (Jöreskog & Sörbom, 1984) ULS, GLS and ML methods 
are available. Additionally, an adjusted minimum residuals 
(MINRES) FA method (Harman & Jones, 1966; Zegers & Ten Berge, 
1983), in which arbitrary lower bounds can be set on the 
unique variances (see also Knol, 1987a), has been used in the 
simulation study. An advantage of MINRES is the possibility 
of avoiding Heywood cases. 

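For each method, estimates of the parameters 
$\Lambda = (\boldsymbol{\lambda}_1, \ldots, \boldsymbol{\lambda}_n)'$, $\boldsymbol{\psi}^2 = (\psi_1^2, \ldots, \psi_n^2)'$, $\boldsymbol{\tau} = (\tau_1, \ldots, \tau_n)'$, 
$A = (\mathbf{a}_1, \ldots, \mathbf{a}_n)'$ and $\boldsymbol{\beta} = (\beta_1, \ldots, \beta_n)'$ can be obtained by 
either the transformations (13) for the FA models or (14) for 
the IRT models. 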
A Simulation Study Comparing Methods 



To compare the various methods, data matrices were 
generated with known discrimination and difficulty 
parameters. The item parameters of four different sets of 
items are given in Table 1, where the items within each group 
sharing the same discrimination parameters have difficulties 
-2, -1, 0, 1 and 2, respectively. 



Insert Table 1 about here 



Multidimensional abilities $\boldsymbol{\theta}$ were drawn from the 
MVN distribution using the procedure G05EZF of the NAG 
(1984) program library. The binary response of observation v 
(v = 1, ..., N) on item i was obtained by 

(15)  $x_{vi} = \begin{cases} 1, & \text{if } u_{vi} \le p_i(\boldsymbol{\theta}_v) \\ 0, & \text{otherwise,} \end{cases}$ 

where $u_{vi}$ is randomly drawn from the uniform [0, 1] 
distribution and $p_i(\boldsymbol{\theta}_v)$ is given by (3). To verify the 
effects due to the sample size, the number of subjects was 
set equal to 250, 500 and 1000. For each set of 
item parameters and each sample size, 10 replications were 
generated. 
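
The data generation step can be sketched as follows. This is an illustration under stated assumptions rather than the original procedure: Python's `random` module stands in for the NAG routine, a response is scored 1 when the uniform draw does not exceed the ICF value, and the names `simulate_responses`, `A_SET2` and `BETA_SET2` are hypothetical. The parameter values follow data set 2 of Table 1.

```python
import math
import random


def normal_cdf(z: float) -> float:
    """Cumulative standard normal distribution F(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_responses(a, beta, n_subjects, seed=0):
    """Draw an N x n binary data matrix under the M2PNO model (3).

    a    : list of n discrimination vectors, each of length m
    beta : list of n difficulty parameters
    """
    rng = random.Random(seed)
    m = len(a[0])
    data = []
    for _ in range(n_subjects):
        # Abilities are independent standard normal, i.e. MVN(0, I).
        theta = [rng.gauss(0.0, 1.0) for _ in range(m)]
        row = []
        for a_i, beta_i in zip(a, beta):
            p = normal_cdf(sum(x * t for x, t in zip(a_i, theta)) - beta_i)
            # Score 1 when the uniform draw does not exceed the ICF value.
            row.append(1 if rng.random() <= p else 0)
        data.append(row)
    return data


# Data set 2 of Table 1: three groups of five items, two dimensions.
A_SET2 = [[1.0, 1.0]] * 5 + [[1.0, 0.0]] * 5 + [[0.0, 1.0]] * 5
BETA_SET2 = [-2.0, -1.0, 0.0, 1.0, 2.0] * 3

if __name__ == "__main__":
    data = simulate_responses(A_SET2, BETA_SET2, n_subjects=250, seed=1)
    proportions = [sum(row[i] for row in data) / len(data) for i in range(15)]
    print([round(p, 2) for p in proportions])
```

Within each group of five items the proportion correct should fall as the difficulty rises from -2 to 2.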

The sample tetrachoric correlations were computed by 
means of the procedure BECTR from the IMSL (1984) program 
library. In BECTR, a tetrachoric correlation is computed as 
the root of a sixth-degree equation. If there was no 
solution, a solution was obtained by adding an observation to 
each cell of the 2x2 frequency table (cf. Mislevy, 1986). 
In the case of multiple roots, the root with the smallest 
absolute value was used. 
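
For readers without access to IMSL, a tetrachoric correlation can also be obtained by a different route from the one used in BECTR. The sketch below is a swapped-in technique, not the sixth-degree equation described above: it solves for the correlation at which the bivariate normal orthant probability matches the observed proportion in the (1, 1) cell, using bisection and Simpson's rule. All function names are hypothetical, and tables with an empty margin are not handled.

```python
import math
from statistics import NormalDist

_STD = NormalDist()


def bvn_density(h, k, r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    q = (h * h - 2.0 * r * h * k + k * k) / (2.0 * (1.0 - r * r))
    return math.exp(-q) / (2.0 * math.pi * math.sqrt(1.0 - r * r))


def upper_orthant(h, k, rho, panels=200):
    """P(Y1 > h, Y2 > k): value at rho = 0 plus the integral of the density over r."""
    base = (1.0 - _STD.cdf(h)) * (1.0 - _STD.cdf(k))
    if rho == 0.0:
        return base
    step = rho / panels  # panels must be even for Simpson's rule
    total = bvn_density(h, k, 0.0) + bvn_density(h, k, rho)
    for j in range(1, panels):
        total += (4.0 if j % 2 else 2.0) * bvn_density(h, k, j * step)
    return base + total * step / 3.0


def tetrachoric(n11, n10, n01, n00, tol=1e-8):
    """Tetrachoric correlation from a 2x2 table; n11 counts (X_i = 1, X_j = 1)."""
    n = n11 + n10 + n01 + n00
    h = _STD.inv_cdf(1.0 - (n11 + n10) / n)  # threshold of item i
    k = _STD.inv_cdf(1.0 - (n11 + n01) / n)  # threshold of item j
    target = n11 / n
    lo, hi = -0.999, 0.999
    # The orthant probability increases with rho, so bisection applies.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if upper_orthant(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(round(tetrachoric(200, 100, 100, 200), 4))
```

With both thresholds at zero the orthant probability equals 1/4 + arcsin(rho) / (2 pi), so the table above, with one third of the observations in the (1, 1) cell, corresponds to a correlation of one half.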

In the next section six criteria to assess differences 
between known and estimated item parameters will be given. 

Six Criteria 

All models will be compared with respect to estimates of 
both IRT and FA parameters. The criteria will be in terms of 
mean squared differences between the known and estimated item 
parameters. 

In the case of orthogonal abilities $\boldsymbol{\theta}$, the n x m matrix 
$\Lambda$ of factor loadings is determined up to an orthogonal 
transformation T. If $\Lambda_0$, $\boldsymbol{\psi}_0^2$, $\boldsymbol{\tau}_0$, $A_0$ and $\boldsymbol{\beta}_0$ are the known 
item parameters, where the dimensionality m is known, then 
the first criterion is given by 

(16a)  $g_1(\Lambda, T) = \{ (mn)^{-1} \, \mathrm{tr}\, (\Lambda T - \Lambda_0)'(\Lambda T - \Lambda_0) \}^{1/2}$ , 

where (16a) is minimized as a function of T under the 
constraint that T is orthogonal. If $P \Delta Q' = \Lambda'\Lambda_0$ denotes the 
singular value decomposition of the matrix $\Lambda'\Lambda_0$, then the 
minimum of (16a) is attained for $T = PQ'$ (Green, 1952). The 
criteria for the unique variances $\boldsymbol{\psi}^2$ and the thresholds $\boldsymbol{\tau}$ 
are 

(16b)  $g_2(\boldsymbol{\psi}^2) = \{ n^{-1} (\boldsymbol{\psi}^2 - \boldsymbol{\psi}_0^2)'(\boldsymbol{\psi}^2 - \boldsymbol{\psi}_0^2) \}^{1/2}$ 

and 

(16c)  $g_3(\boldsymbol{\tau}) = \{ n^{-1} (\boldsymbol{\tau} - \boldsymbol{\tau}_0)'(\boldsymbol{\tau} - \boldsymbol{\tau}_0) \}^{1/2}$ , 

respectively. In the case of orthogonal $\boldsymbol{\theta}$, the n x m matrix A 
of discriminations is also determined up to an orthogonal 
transformation, and the first criterion for the IRT 
parametrization is 

(16d)  $g_4(A, T) = \{ (mn)^{-1} \, \mathrm{tr}\, (AT - A_0)'(AT - A_0) \}^{1/2}$ , 

where (16d) is also minimized as a function of T under the 
constraint that T is orthogonal. The criterion for the 
difficulties is 

(16e)  $g_5(\boldsymbol{\beta}) = \{ n^{-1} (\boldsymbol{\beta} - \boldsymbol{\beta}_0)'(\boldsymbol{\beta} - \boldsymbol{\beta}_0) \}^{1/2}$ . 

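The last criterion is given by the mean squared difference 
between the values of the generated and estimated ICFs, 

(16f)  $g_6(P) = \{ (Nn)^{-1} \sum_{v=1}^{N} \sum_{i=1}^{n} [p_i(\boldsymbol{\theta}_v) - p_{i0}(\boldsymbol{\theta}_v)]^2 \}^{1/2}$ , 

where $p_{i0}(\boldsymbol{\theta}_v) = F(\mathbf{a}_{i0}'\boldsymbol{\theta}_v - \beta_{i0})$ and the $\boldsymbol{\theta}_v$ (v = 1, ..., N) are assumed 
to be known. 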
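
The rotation that minimizes the first criterion follows from a short argument, sketched here for the loadings. The symbol $\Delta$ for the diagonal matrix of singular values is a notational assumption; $\Lambda$ is the estimated loading matrix, $\Lambda_0$ the known one, and $P \Delta Q'$ the singular value decomposition of $\Lambda'\Lambda_0$.

```latex
\begin{aligned}
\mathrm{tr}\,(\Lambda T - \Lambda_0)'(\Lambda T - \Lambda_0)
  &= \mathrm{tr}\,\Lambda'\Lambda + \mathrm{tr}\,\Lambda_0'\Lambda_0
     - 2\,\mathrm{tr}\,T'\Lambda'\Lambda_0
     && \text{(since } TT' = I\text{)} \\
\mathrm{tr}\,T'\Lambda'\Lambda_0
  &= \mathrm{tr}\,T'P\,\Delta\,Q'
   = \mathrm{tr}\,(Q'T'P)\,\Delta
   \;\le\; \mathrm{tr}\,\Delta
     && \text{(} Q'T'P \text{ is orthogonal, } \Delta \ge 0\text{)}
\end{aligned}
```

The diagonal elements of an orthogonal matrix cannot exceed one, so the bound is reached when $Q'T'P = I$, that is, for $T = PQ'$. The same argument applies to the discrimination matrix $A$.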



Results 

In order to apply the GLS and ML common FA methods, the 
matrix of tetrachoric correlations has to be positive 
definite. If the matrix of tetrachoric correlations R is 
indefinite or ill-conditioned with respect to inversion, a 
smoothing procedure has to be used. Let $R = KDK'$ be the 
eigendecomposition of R, where D is an n x n diagonal matrix 
containing the eigenvalues $d_i$ of R in descending order and K 
is the matrix of corresponding normalized eigenvectors. Then 
a nonnegative definite correlation matrix $R^+(\delta)$ can be 
obtained by 

(17)  $R^+(\delta) = (\mathrm{Diag}\, KD^+K')^{-1/2} \, KD^+K' \, (\mathrm{Diag}\, KD^+K')^{-1/2}$ , 

where $d_i^+$ is diagonal element i of $D^+$ (i = 1, ..., n), with 
$d_i^+ = \max(d_i, \delta)$ and $\delta \ge 0$. Note that if $\delta = 0$, $KD^+K'$ is the 
least-squares approximation to R of rank r, where r is the 
number of positive eigenvalues of R (cf. Rao, 1973). Note 
also that $R^+(0)$ coincides with Frane's (1978) smoothing 
procedure. 
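
The smoothing step in (17) can be sketched in a few lines. This is an illustrative implementation, not the code used in the study: the eigendecomposition is computed here with a plain cyclic Jacobi method so that no external library is needed, and the names `jacobi_eigen` and `smooth_correlation` are hypothetical.

```python
import math


def jacobi_eigen(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues and eigenvectors (columns of k) of a symmetric matrix."""
    n = len(matrix)
    a = [row[:] for row in matrix]
    k = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes element (p, q).
                theta = 0.5 * math.atan2(2.0 * a[p][q], a[q][q] - a[p][p])
                c, s = math.cos(theta), math.sin(theta)
                for i in range(n):  # rotate columns p and q
                    aip, aiq = a[i][p], a[i][q]
                    a[i][p], a[i][q] = c * aip - s * aiq, s * aip + c * aiq
                for j in range(n):  # rotate rows p and q
                    apj, aqj = a[p][j], a[q][j]
                    a[p][j], a[q][j] = c * apj - s * aqj, s * apj + c * aqj
                for i in range(n):  # accumulate the eigenvectors
                    kip, kiq = k[i][p], k[i][q]
                    k[i][p], k[i][q] = c * kip - s * kiq, s * kip + c * kiq
    return [a[i][i] for i in range(n)], k


def smooth_correlation(r, delta=0.005):
    """Smoothed matrix of (17): raise eigenvalues to at least delta, then rescale."""
    n = len(r)
    d, k = jacobi_eigen(r)
    d_plus = [max(x, delta) for x in d]
    s = [[sum(k[i][t] * d_plus[t] * k[j][t] for t in range(n))
          for j in range(n)] for i in range(n)]
    scale = [1.0 / math.sqrt(s[i][i]) for i in range(n)]
    return [[scale[i] * s[i][j] * scale[j] for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    # An indefinite "correlation" matrix with eigenvalues 1.9, 1.9 and -0.8.
    r = [[1.0, 0.9, -0.9], [0.9, 1.0, 0.9], [-0.9, 0.9, 1.0]]
    smoothed = smooth_correlation(r, delta=0.005)
    print([[round(x, 3) for x in row] for row in smoothed])
    print(min(jacobi_eigen(smoothed)[0]) > 0.0)
```

A matrix whose eigenvalues already exceed the bound is returned unchanged, apart from rounding, because the rescaling only restores the unit diagonal.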

To investigate the effects of the smoothing procedure, 
the common FA methods that do not require positive 
definiteness of R (i.e., IPFA, ULS, MINRES and ALPHA) were 
applied to 10 indefinite matrices obtained from data set 2 
(cf. Table 1) with N = 250 and various values of $\delta$. The 
results are given in Table 2. 



Insert Table 2 about here 



From the results in Table 2 it can be inferred that the effect of 
smoothing is negligible; only for MINRES is a slight increase in 
the criterion values observed. Additionally, a small decrease in the 
total number of variables with estimated unique variances 
smaller than .2 is observed for increasing values of $\delta$. 
Therefore it was decided to perform all common FA on the same 
smoothed tetrachoric correlation matrix $R^+(.005)$, ensuring 
that the matrix to be analyzed is sufficiently well 
conditioned. 

In Table 3 the mean values of the six criteria over 10 
replications for the different methods are given for 
generated data corresponding to the unidimensional data set 
1. 



Insert Table 3 about here 



Because the ULS procedures of SPSS-X and LISREL and the IPFA 
procedure of SPSS-X gave the same results, only the IPFA procedure 
of SPSS-X is reported in the tables. Only the GLS procedure from 
LISREL is reported, because it gives consistently better 
results than the corresponding SPSS-X procedure. The ML 
procedure of SPSS-X was chosen because the 
corresponding LISREL procedure often did not converge to a 
proper solution. 

From the results in Table 3 it can be concluded that 
MAXLOG performs very badly. The GLS and ML procedures also 
perform badly. As expected, TESTFACT is the best procedure, and 
NOHARM also performs quite well. The procedures IPFA and 
MINRES give approximately the same results. 

The results of the various methods obtained from the 
multidimensional datasets 2, 3 and 4 are given in Tables 
4, 5 and 6, respectively. 



Insert Tables 4-6 about here 



It has to be noted that the GLS procedure applied to data set 
4 never converged to a proper solution; hence, the outcomes 
are not reported in Table 6. 

Essentially the same conclusions can be drawn from the 
results for these multidimensional datasets. However, for the 
three-dimensional datasets 3 and 4 the performance of NOHARM 
and TESTFACT deteriorates, and in fact becomes even worse than 
that of the simple common FA procedures IPFA and MINRES. 

The Dimensionality of Binary Scored Items 

The dimensionality of binary items has been a source of 
debate in the educational and psychological literature, and 
various aspects have been discussed by Goldstein (1980), 
McDonald (1981) and Hattie (1985), among others. 

The increasing interest in IRT and the widespread use of 
the one-, two- and three-parameter IRT models, which all 
assume a unidimensional ability space, have increased the need 
for a clear definition of dimensionality. Moreover, reliable 
indices to assess the dimensionality of a set of binary 
scored items are needed. 

Both Goldstein (1980) and McDonald (1981) discuss the 
dimensionality of binary items in relation to the principle 
of local independence. The formal requirement of the 
independence of the item responses $(x_1, \ldots, x_n)$ is that the 
joint distribution of the responses given a vector of 
abilities $\boldsymbol{\theta}$ is equal to the product of the marginal 
distributions of the items given $\boldsymbol{\theta}$, i.e., 

(19)  $P(\mathbf{X} = \mathbf{x} \mid \boldsymbol{\theta}) = \prod_{i=1}^{n} P(X_i = x_i \mid \boldsymbol{\theta})$ . 

Goldstein (1980) states that a distinction should be 
made between the conditional distribution of the item 
responses given $\boldsymbol{\theta}$ and the conditional distribution of item 
responses given both $\boldsymbol{\theta}$ and the responses to the other items. 
A unidimensional model can be assumed with or without local 
independence. The assumption of local independence is 
very strong, and Goldstein (1980) doubts whether this 
assumption is actually met in real-life situations. 

On the other hand, McDonald (1981) argues that the 
principle of local independence and the definition of 
dimensionality are related to each other. If a subject from a 
given population is completely characterized by one or more 
abilities $\boldsymbol{\theta} = (\theta_1, \ldots, \theta_m)$, then the scores of that subject 
with the abilities $\boldsymbol{\theta}$ on the n items are mutually 
statistically independent. This means that, if these 
abilities span the complete ability space in the population, 
all mutual statistical dependencies among the n items are 
explained by these abilities $(\theta_1, \ldots, \theta_m)$. If, however, a 
model is specified with a number of abilities which do not 
span the complete ability space, then there will still remain 
mutual dependencies among the items. An adequate method to 
specify the dimensionality of a set of binary items is 
therefore needed. Unfortunately, no all-round index to 
identify the dimensionality of binary items is available. 

Dimensionality Indices 

The most frequently applied procedure to verify the 
dimensionality of binary items is to compute tetrachoric 
correlations and to inspect the eigenvalues of the 
corresponding matrix. Sometimes even phi-coefficients are 
used (Hambleton & Rovinelli, 1986), but it is well known that 
these coefficients are affected by the difficulties of the 
items. As already mentioned above, the use of tetrachoric 
correlations has some disadvantages. 

Dimensionality indices obtained from linear factor 
analysis will not be optimal, since IRT models for binary 
data are intrinsically nonlinear, i.e. nonlinear in the item 
parameters and in the ability parameters. 

To date, there are no widely accepted tests of fit 
for models formulated on binary items which are comparable 
with the $\chi^2$-test and the residual analysis in common FA, and 
research is needed on the assessment of misfit of 
multidimensional IRT models. Perhaps the use of a formal 
$\chi^2$-test should be avoided, because of the distributional 
problems connected with the $\chi^2$-test for small samples and 
because the use of a test statistic is never in itself a 
sufficient justification for the acceptance or rejection of a 
certain model. 

Hambleton and Rovinelli (1986) compared some methods for 
the determination of the dimensionality of a set of items and 
concluded that linear factor analysis based on 
phi-coefficients tended to overestimate the number of underlying 
abilities and that inspection of the residuals obtained from 
a nonlinear factor analysis would be a promising approach to 
assess dimensionality. 

Hattie (1985) reviewed the rationale of various methods 
and concluded that too many indices were developed on an ad 
hoc basis. Hattie (1984) reported that indices based on 
residuals obtained from nonlinear factor analysis could very 
well distinguish a unidimensional set of items from a set 
with more than one dimension and recommended the use of the 
mean squared or mean absolute residuals as a suitable loss 
function. 

Tucker, Humphreys, Lloyd, and Roznowski (1986) compared 
some indices based on the eigenvalues of the tetrachoric 
correlation matrix with some indices based on the local- 
independence principle. Their preliminary results seem to 
indicate that the indices based on the eigenvalues do not 
work very well. 

A Simulation Study Assessing Dimensionality 

Following the suggestion made by Hambleton and Rovinelli 
(1986) and Hattie (1984, 1985), the residuals were used as a 
measure for dimensionality. 

If $\Lambda_k$ is the estimated matrix of factor loadings from a 
solution with k estimated common factors, then the matrix of 
residuals $R^* = [r_{ij}^*]$ is 

(19)  $R^* = R - \Lambda_k \Lambda_k'$ 

and the mean squared and mean absolute residuals are 

(20a)  $e_1 = 2[n(n-1)]^{-1} \sum \sum_{i<j} (r_{ij}^*)^2$ 

and 

(20b)  $e_2 = 2[n(n-1)]^{-1} \sum \sum_{i<j} |r_{ij}^*|$ , 

respectively. 

For each of the first three datasets given in Table 1, 
with one, two and three dimensions, respectively, 
5 data matrices were generated for sample sizes 250, 500 and 
1000. In Tables 7, 8 and 9 the mean squared residuals are 
given for the three datasets, after an analysis was performed 
with assumed dimensionality ranging from one to five. 
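
The two residual indices are simple to compute once a loading matrix is available. The sketch below assumes the loadings have already been estimated by some FA method; the function name `residual_indices` and the toy matrices are hypothetical and are not results from the study.

```python
def residual_indices(r, loadings):
    """Mean squared (e1) and mean absolute (e2) off-diagonal residuals of R - LL'.

    r        : n x n correlation matrix (list of lists)
    loadings : n x k matrix of estimated factor loadings
    """
    n = len(r)
    squared = absolute = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            fitted = sum(x * y for x, y in zip(loadings[i], loadings[j]))
            residual = r[i][j] - fitted
            squared += residual ** 2
            absolute += abs(residual)
    pairs = n * (n - 1) / 2
    return squared / pairs, absolute / pairs


if __name__ == "__main__":
    # Toy correlation matrix generated by two uncorrelated factors.
    r = [[1.0, 0.64, 0.0, 0.0],
         [0.64, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.64],
         [0.0, 0.0, 0.64, 1.0]]
    one_factor = [[0.8], [0.8], [0.0], [0.0]]
    two_factors = [[0.8, 0.0], [0.8, 0.0], [0.0, 0.8], [0.0, 0.8]]
    print(residual_indices(r, one_factor))
    print(residual_indices(r, two_factors))
```

In this toy case the one-factor solution leaves the correlation between the last two items unexplained, while the two-factor solution reproduces the matrix, so both indices drop to zero at the generating dimensionality.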



Insert Tables 7-9 about here 



Since the mean absolute residuals and the mean squared 
residuals lead to the same conclusions, only the mean squared 
residuals are given. To gauge the sizes of the obtained 
residuals, the mean squared residuals for datasets with all 
correlations among the variables equal to zero, i.e. no 
common factors, are also given. In Tables 7 through 9 
only the results for MINRES, NOHARM and TESTFACT are given. 
Although an ordinary principal components 
analysis (PCA) will generally not give a satisfactory fit 
when applied to binary data, the pattern of residuals might 
give an adequate indication of dimensionality. Therefore the 
residuals obtained after a PCA are also given in the tables. 

From Tables 7 through 9, relatively high values of 
the mean squared residuals $e_1$ can be observed when datasets 
have been analyzed with a smaller dimensionality than the 
generated dimensionality. This applies to all methods. Also, 
a large drop of $e_1$ can be observed between the analyses with 
m-1 and m dimensions (where m is the generated dimensionality 
of the dataset). No such drops of $e_1$ are observed for 
analyses with higher assumed dimensionality. Hence, it seems 
that the dimensionality of a dataset can be assessed by 
inspecting the mean squared residuals obtained under different 
assumed dimensionalities. 



Discussion 

The most striking result of the simulation study, in 
which various IRT and FA programs were compared, is that the 
common FA methods outperformed the more complex programs 
TESTFACT, MAXLOG and NOHARM, despite their theoretical 
advantages. Also quite remarkable is the failure of the ML 
and GLS FA procedures compared with IPFA, ULS or MINRES. Of 
course this may be due to the implementation of the specific 
methods. Nevertheless, since no other programs are available, 
it is advised to use IPFA, ULS or MINRES. An additional 
advantage is that these programs can handle relatively large 
numbers of variables and factors. Because IPFA has some 
algorithmic drawbacks (cf. Gorsuch, 1974, p. 98) and MINRES 
(or ULS) performs as well as IPFA, it is advised to avoid 
the usage of IPFA. An advantage of MINRES compared to ULS is 
that MINRES avoids Heywood cases. Therefore, in large-scale 
applications it is advised to use MINRES on the (possibly 
smoothed) matrix of tetrachoric correlations. 

A possible drawback of MINRES could be that no 
statistical goodness-of-fit measure is available; hence the 
estimation of the dimensionality of the model can be 
problematic. Therefore, a non-statistical procedure for 
assessing the dimensionality of the model is proposed, 
leading to essentially the same results as TESTFACT and 
NOHARM. 

Table 1 

Discrimination parameters of the four different data sets 
(each group of five items with the same discrimination 
parameters has difficulty parameters -2, -1, 0, 1, and 2) 

Set   n    m    Items   Discrimination parameters 

1     15   1    (5x)    1.00 
                (5x)    1.25 
                (5x)    1.50 

2     15   2    (5x)    1  1 
                (5x)    1  0 
                (5x)    0  1 

3     15   3    (5x)    1  1  0 
                (5x)    1  0  1 
                (5x)    0  1  1 

4     30   3    (5x)    1  1  0 
                (5x)    1  0  1 
                (5x)    0  1  1 
                (5x)    1  0  0 
                (5x)    0  1  0 
                (5x)    0  0  1 

Table 2 

The total number of estimated unique variances < .2 and the 
mean values of two FA criteria for four CFA methods for data 
set 2 and various $R^+(\delta)$ with N = 250 

                                   $\delta$ 
Crit                Method    $d_n$   0       .001    .005    .01     .05 

#$\psi_i^2$ < .2    IPFA      4       3       3       3       3       3 
                    ULS       4       3       3       3       3       3 
                    MINRES    3       1       0       1       2       2 
                    ALPHA     9       7       7       6       6       4 

$\Lambda$           IPFA      .096    .096    .096    .096    .096    .096 
                    ULS       .096    .096    .096    .096    .096    .096 
                    MINRES    .095    .096    .096    .096    .096    .096 
                    ALPHA     .104    .103    .103    .103    .103    .103 

$\psi^2$            IPFA      .125    .125    .125    .125    .125    .125 
                    ULS       .125    .125    .125    .125    .125    .125 
                    MINRES    .121    .124    .124    .124    .124    .124 
                    ALPHA     .144    .141    .141    .141    .141    .140 

Table 3 

Mean values of the six criteria for different N and various 
methods for data set 1 

                                      Criterion 
N       Method      $\Lambda$   $\psi^2$    $\tau$      $A$         $\beta$     $P$ 

250     IPFA        .091        .127        .097        .296        .231        .047 
        ALPHA       .096        .133        .097        .311        .239        .048 
        ML          .092        .128        .097        .294        .226        .047 
        GLS         .122        .166        .097        [illegible] [illegible] [illegible] 
        MINRES      .091        .127        .097        .296        .231        .047 
        NOHARM      .067        .098        .097        .264        .245        .043 
        TESTFACT    .064        .093        .105        .244        .226        .045 
        MAXLOG      .081        .124        .128        .454        .440        .056 

500     IPFA        .051        .075        .058        .188        .152        .030 
        ALPHA       .053        .078        .058        .196        .157        .030 
        ML          .050        .073        .058        .183        .151        .029 
        GLS         .071        .102        .058        .244        .195        .034 
        MINRES      .051        .075        .058        .188        .152        .030 
        NOHARM      .047        .071        .058        .198        .188        .029 
        TESTFACT    .044        .065        .058        .169        .166        .028 
        MAXLOG      .066        .105        .101        .410        .414        .044 

1000    IPFA        .031        .047        .044        .122        .108        .021 
        ALPHA       .032        .048        .044        .126        .109        .021 
        ML          .031        .046        .044        .121        .108        .021 
        GLS         .036        .054        .044        .140        .127        .022 
        MINRES      .031        .047        .044        .122        .108        .021 
        NOHARM      .029        .045        .044        .130        .120        .021 
        TESTFACT    .028        .042        .044        .110        .104        .020 
        MAXLOG      .048        .079        .079        .340        .352        .034 

Table 4 

Mean values of the six criteria for different N and various 
methods for data set 2 

                                      Criterion 
N       Method      $\Lambda$   $\psi^2$    $\tau$      $A$         $\beta$     $P$ 

250     IPFA        .103        .130        .093        .239        .237        .057 
        ALPHA       .103        .138        .093        .254        .238        .059 
        ML          .119        .149        .093        .280        .288        .063 
        GLS         [illegible] .166        .093        .283        .271        .063 
        MINRES      .103        .130        .093        .239        .237        .057 
        NOHARM      .099        .136        .093        .284        .344        .059 
        TESTFACT    .093        .122        .101        .258        .296        .059 
        MAXLOG      .119        .227        .149        .724        .571        .102 

500     IPFA        .072        .093        .061        .181        .173        .041 
        ALPHA       .073        .094        .061        .186        .175        .042 
        ML          .078        .101        .061        .193        .1?6        .043 
        GLS         .080        .109        .061        .192        .189        [illegible] 
        MINRES      .072        .091        .061        .179        .173        .041 
        NOHARM      .070        .092        .061        .192        .210        .042 
        TESTFACT    .070        .089        .065        .167        .181        .041 
        MAXLOG      .104        .210        .106        .655        .496        .088 

1000    IPFA        .048        .057        .050        .113        .102        .030 
        ALPHA       .048        .057        .050        .114        .102        .030 
        ML          .059        .073        .050        .139        .129        .035 
        GLS         .052        .063        .050        .115        .115        .030 
        MINRES      .048        .057        .050        .113        .102        .030 
        NOHARM      .047        .058        .050        .120        .121        .030 
        TESTFACT    .048        .061        .054        .116        .112        .031 
        MAXLOG      .099        .205        .087        .604        .488        .082 

Table 5

Mean values of the six criteria for different N and various 
methods for data set 3 



Criterion 



N


Method 


A 


^2 


1 


A 


& 


P 


 250  IPFA      .095  .116  .086  .237  .218  .067
      ALPHA     .097  .121  .086  .245  .225  .069
      ML        .120  .148  .086  .289  .258  .081
      GLS       .135  .204     ?     ?     ?     ?
      MINRES    .095  .112  .086  .235  .218  .067
      NOHARM    .087  .108  .086  .243  .283  .066
      TESTFACT  .094  .119  .118  .324  .318  .078
      MAXLOG    .288  .415  .497  .739  .739  .222

 500  IPFA      .063  .083  .069  .183  .189  .051
      ALPHA     .063  .085  .069  .184  .191  .052
      ML        .085  .110  .069  .228  .213  .064
      GLS       .089  .127  .069  .243  .243  .065
      MINRES    .062  .081  .069  .182  .189  .051
      NOHARM    .060  .079  .069  .182  .220  .051
      TESTFACT  .066  .098  .080  .207  .225  .055
      MAXLOG    .260  .397  .498  .694  .871  .218

1000  IPFA      .045  .057  .050  .126  .135  .037
      ALPHA     .045  .058  .050  .127  .134  .037
      ML        .045  .059  .050  .128  .140  .037
      GLS       .053  .076  .050  .148  .164  .041
      MINRES    .045  .058  .050  .126  .135  .037
      NOHARM    .043  .055  .050  .122  .136  .036
      TESTFACT  .049  .082  .065  .147  .154  .043
      MAXLOG    .287  .444  .467  .698  .682  .213

Table 6

Mean values of the six criteria for different N and various 
methods for data set 4 



Criterion 



N 


Method


A 




1 


A 


£ 


P 


 250  IPFA      .087  .113  .090  .183  .218  .057
      ALPHA     .089  .118  .090  .192  .223  .059
      ML        .087  .113  .090  .185  .224  .057
      GLS
      MINRES    .087  .112  .090  .183  .218  .057
      NOHARM    .083  .101  .090  .202  .263  .058
      TESTFACT  .092  .126  .121  .209  .285  .067
      MAXLOG    .159  .211  .187  .747  .454  .140

 500  IPFA      .063  .075  .066  .138  .154  .043
      ALPHA     .063  .076  .066  .140  .156  .044
      ML        .063  .077  .066  .140  .157  .044
      GLS
      MINRES    .063  .075  .066  .137  .154  .043
      NOHARM    .063  .079  .066  .155  .193  .045
      TESTFACT  .073  .103  .089  .161  .172  .053
      MAXLOG    .154  .194  .159  .670  .393  .130

1000  IPFA      .043  .055  .045  .104  .118  .031
      ALPHA     .044  .056  .045  .108  .120  .031
      ML        .043  .055  .045  .104  .120  .031
      GLS
      MINRES    .043  .055  .045  .104  .118  .031
      NOHARM    .043  .054  .045  .107  .131  .031
      TESTFACT  .056  .095  .075  .137  .126  .044
      MAXLOG    .103  .165  .107  .552  .376  .094



Table 7

Mean squared residuals for different generated and assumed
dimensionality of the data and various methods with N = 250

            Generated               Assumed Dimensionality
Method      Dimensionality       1       2       3       4       5

PCA               0            .0184   .0163   .0147   .0125   .0101
                  1            .0111   .0081   .0061   .0048   .0039
                  2            .0354   .0102   .0072   .0056   .0041
                  3            .0347   .0165   .0060   .0045   .0034
MINRES            0            .0154   .0111   .0081   .0056   .0040
                  1            .0098   .0064   .0043   .0031   .0022
                  2            .0338   .0083   .0053   .0037   .0027
                  3            .0331   .0149   .0046   .0032   .0024
NOHARM            0            .0072   .0042   .0029   .0022   .0015
                  1            .0038   .0024   .0013   .0007   .0004
                  2            .0169   .0023   .0015   .0010   .0006
                  3            .02?4   .0??2   .0027   .0017   .0011
TESTFACT          0            .0150   .0107   .0077   .0054   .0039
                  1            .0072   .0050   .0036   .0031   .0022
                  2            .0288   .0079   .0045   .0029   .0020
                  3            .0256   .0131   .0057   .0037   .0039

Table 8

Mean squared residuals for different generated and assumed
dimensionality of the data and various methods with N = 500

            Generated               Assumed Dimensionality
Method      Dimensionality       1       2       3       4       5

PCA               0            .0129   .0129   .0121   .0118   .0112
                  1            .0073   .0054   .0045   .0037   .0033
                  2            .035?   .0064   .0048   .0038   .0031
                  3            .0275   .0135   .0038   .0029   .0023
MINRES            0            .0096   .0073   .0050   .0036   .0024
                  1            .0061   .0038   .0027   .0019   .0013
                  2            .0334   .0046   .0029   .0020   .0014
                  3            .0260   .01?9   .0024   .0016   .0012
NOHARM            0            .0038   .0026   .0016   .0011   .0006
                  1            .0018   .0011   .0007   .0005   .0003
                  2                ?   .0013   .0008   .0005   .0003
                  3            .0218   .0094   .0011   .0007   .0004
TESTFACT          0            .0095   .0069   .0050   .0036   .0026
                  1            .0061   .0039   .0025   .0015   .0010
                  2            .0303   .0057   .0034   .0021   .0015
                  3            .022?   .0104   .0031   .0021   .0012

Table 9

Mean squared residuals for different generated and assumed
dimensionality of the data and various methods with N = 1000

            Generated               Assumed Dimensionality
Method      Dimensionality       1       2       3       4       5

PCA               0            .0080   .0090   .0100   .0106   .0113
                  1            .0042   .0040   .0039   .0036   .0033
                  2                ?   .0044   .0037   .0031   .0029
                  3            .0298   .0153   .0027   .0021   .0018
MINRES            0            .0049   .0036   .00??   .001?   .0012
                  1            .0030   .0021   .0018   .0011   .0008
                  2            .0290   .0026   .0017   .0011   .0008
                  3            .0283   .013?   .0014   .0009   .0006
NOHARM            0            .0017   .0012   .0008   .0006   .0004
                  1            .0009   .0005   .0003   .0002   .0001
                  2            .0159   .0008   .0005   .0003   .0002
                  3            .0225   .0105   .0008   .0004   .0003
TESTFACT          0            .0050   .0036   .0026   .0018   .0013
                  1                ?   .0019   .0015   .0011   .0008
                  2            .0283   .0027   .0017   .0011   .0008
                  3            .0245   .0124   .0024   .0014   .0010
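
The mean squared residuals in Tables 7 to 9 measure how far the correlations reproduced by a solution of a given assumed dimensionality lie from the observed correlations. The following is a minimal sketch of such a computation, not the authors' program. It assumes that the residual matrix is the observed correlation matrix minus the product of the loading matrix with its transpose, and that only the off-diagonal elements are averaged; the function name mean_squared_residual and the toy matrices are hypothetical.

```python
def mean_squared_residual(R, loadings):
    """Mean squared off-diagonal residual of R - L L'.

    R        : n x n correlation matrix (list of lists)
    loadings : n x k loading matrix (list of lists), where k is the
               assumed dimensionality
    Assumption: only the n(n-1)/2 off-diagonal elements are averaged.
    """
    n = len(R)
    total = 0.0
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Correlation reproduced by the factor solution for items i and j
            reproduced = sum(a * b for a, b in zip(loadings[i], loadings[j]))
            total += (R[i][j] - reproduced) ** 2
            count += 1
    return total / count


# Toy example (hypothetical values): three items whose correlations
# are reproduced exactly by a single factor.
lam = [[0.8], [0.7], [0.6]]
R = [[1.0, 0.56, 0.48],
     [0.56, 1.0, 0.42],
     [0.48, 0.42, 1.0]]

msr_one_factor = mean_squared_residual(R, lam)
# All-zero loadings stand in for a solution with no common factor.
msr_zero_factor = mean_squared_residual(R, [[0.0]] * 3)
```

With these toy values the one-factor residual is essentially zero, while the all-zero loadings give the mean of the squared correlations. Under this reading, the assumed dimensionality at which the residual stops dropping sharply would be the candidate dimensionality.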




Titles of recent Research Reports from the Division of
Educational Measurement and Data Analysis,
University of Twente, Enschede,
The Netherlands.

RR-87-1   R. Engelen, Semiparametric estimation in the Rasch
          model
RR-87-2   W.J. van der Linden (Ed.), IRT-based test construction
RR-87-3   R. Engelen, P. Thommassen, & W. Vervaat, Ignatov's
          theorem: A new and short proof
RR-87-4   E. van der Burg & J. de Leeuw, Use of the multinomial
          jackknife and bootstrap in generalized nonlinear
          canonical correlation analysis
RR-87-5   H. Kelderman, Estimating a quasi-loglinear models
          for the Rasch table if the number of items is large
RR-87-6   R. Engelen, A review of different estimation
          procedures in the Rasch model
RR-87-7   D.L. Knol & J.M.F. ten Berge, Least-squares
          approximation of an improper by a proper correlation
          matrix using a semi-infinite convex program
RR-87-8   E. van der Burg & J. de Leeuw, Nonlinear canonical
          correlation analysis with k sets of variables
RR-87-9   W.J. van der Linden, Applications of decision
          theory to test-based decision making
RR-87-10  W.J. van der Linden & E. Boekkooi-Timminga, A
          maximin model for test design with practical
          constraints
RR-88-1   E. van der Burg & J. de Leeuw, Nonlinear redundancy
          analysis
RR-88-2   W.J. van der Linden & J.J. Adema, Algorithmic test
          design using classical item parameters
RR-88-3   E. Boekkooi-Timminga, A cluster-based method for
          test construction
RR-88-4   J.J. Adema, A note on solving large-scale zero-one
          programming problems
RR-88-5   W.J. van der Linden, Optimizing incomplete sample
          designs for item response model parameters
RR-88-6   H.J. Vos, The use of decision theory in the
          Minnesota Adaptive Instructional System
RR-88-7   J.H.A.N. Rikers, Towards an authoring system for
          item construction
RR-88-8   R.J.H. Engelen & W.J. van der Linden, Item
          information in the Rasch model
RR-88-9   W.J. van der Linden & T.J.H.M. Eggen, The Rasch
          model as a model for paired comparisons with an
          individual tie parameter
RR-88-10  H. Kelderman & G. Macready, Loglinear-latent-class
          models for detecting item bias
RR-88-11  D.L. Knol & M.P.F. Berger, Empirical Comparison
          between Factor Analysis and Item Response Models

Research Reports can be obtained at costs from
Bibliotheek, Department of Education, University of
Twente, P.O. Box 217, 7500 AE Enschede, The
Netherlands.


A publication by
the Department of Education
of the University of Twente